A Factored Language Model of Quantized Pitch and Duration
Abstract
This paper investigates a novel statistical approach to music classification that utilizes recent technology developed in the domain of natural language processing. Specifically, we investigate the use of factored language models (FLMs) for the task of producing conditional probability distributions to model origin-specific folk songs. In our model, pitch cluster and quantized duration are employed as the two fundamental factors in a musical unit. The structure of our FLM is empirically chosen from data to minimize perplexity. We apply this collection of FLMs to the task of folk song classification. Our experiments show classification accuracy of 72.5% on a data set of European folk songs from 6 different regions.
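The classification scheme described in the abstract (one model per region of origin, with the song assigned to the region whose model fits it best) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the class name `TinyFactoredLM`, the fixed interpolation weights, the add-one smoothing, and the toy data are hypothetical. The fixed backoff chain is a simplified stand-in and is not the empirically selected FLM structure reported in the paper.

```python
import math
from collections import Counter


class TinyFactoredLM:
    """Toy model over (pitch, duration) units. Hypothetical; not the paper's FLM."""

    def __init__(self, songs, vocab_size):
        self.vocab_size = vocab_size   # assumed number of distinct (pitch, duration) units
        self.full = Counter()          # counts of (prev_pitch, prev_dur, unit)
        self.full_ctx = Counter()      # counts of (prev_pitch, prev_dur)
        self.part = Counter()          # counts of (prev_pitch, unit): duration factor dropped
        self.part_ctx = Counter()      # counts of prev_pitch
        self.uni = Counter()           # counts of unit
        self.total = 0
        for song in songs:
            for prev, unit in zip(song, song[1:]):
                prev_pitch, prev_dur = prev
                self.full[(prev_pitch, prev_dur, unit)] += 1
                self.full_ctx[(prev_pitch, prev_dur)] += 1
                self.part[(prev_pitch, unit)] += 1
                self.part_ctx[prev_pitch] += 1
                self.uni[unit] += 1
                self.total += 1

    def prob(self, unit, prev):
        prev_pitch, prev_dur = prev
        # Add-one smoothed unigram keeps every probability strictly positive.
        p_uni = (self.uni[unit] + 1) / (self.total + self.vocab_size)
        # Back off to the next coarser estimate when a context was never seen.
        if self.part_ctx[prev_pitch]:
            p_part = self.part[(prev_pitch, unit)] / self.part_ctx[prev_pitch]
        else:
            p_part = p_uni
        if self.full_ctx[(prev_pitch, prev_dur)]:
            p_full = self.full[(prev_pitch, prev_dur, unit)] / self.full_ctx[(prev_pitch, prev_dur)]
        else:
            p_full = p_part
        # Fixed interpolation weights, chosen arbitrarily for this sketch.
        return 0.5 * p_full + 0.3 * p_part + 0.2 * p_uni

    def perplexity(self, song):
        # Perplexity over the note-to-note transitions; needs at least two units.
        pairs = list(zip(song, song[1:]))
        log_sum = sum(math.log(self.prob(unit, prev)) for prev, unit in pairs)
        return math.exp(-log_sum / len(pairs))


def classify(song, models):
    """Return the label of the model with the lowest perplexity on the song."""
    return min(models, key=lambda label: models[label].perplexity(song))


# Toy data: region "A" has rising melodies, region "B" falling ones.
songs_a = [[(0, 1), (1, 1), (2, 1), (3, 1)]] * 3
songs_b = [[(3, 2), (2, 2), (1, 2), (0, 2)]] * 3
models = {
    "A": TinyFactoredLM(songs_a, vocab_size=8),
    "B": TinyFactoredLM(songs_b, vocab_size=8),
}
print(classify([(0, 1), (1, 1), (2, 1)], models))
```

Choosing the label by lowest perplexity is equivalent to choosing the highest average log-likelihood per transition, which is why perplexity can serve both for selecting a model structure and for classification.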
Similar resources
The Function of Pitch Range Variations in Samples of Emotional Expressions in Persian
This study investigates the interface between emotion and intonation patterns (more specifically, duration and pitch amplitude of speech). To this end, the acoustic properties of spectral parameters related to speech prosody are investigated. The results of acoustic and statistical analysis show that mean level and range of F0 in the contours vary strongly as a function of the degree o...
An Acoustic Study of Emotivity-Prosody Interface in Persian Speech Using the Tilt Model
This paper aims to explore some acoustic properties (i.e. duration and pitch amplitude of speech) associated with three different emotions: anger, sadness and joy against neutrality as a reference point, all being intentionally expressed by six Persian speakers. The primary purpose of this study is to find out if there is any correspondence between the given emotions and prosody patterning in P...
Use of Hidden Markov Models and Factored Language Models for Automatic Chord Recognition
This paper focuses on automatic extraction of acoustic chord sequences from a musical piece. Standard and factored language models are analyzed in terms of applicability to the chord recognition task. Pitch class profile vectors that represent harmonic information are extracted from the given audio signal. The resulting chord sequence is obtained by running a Viterbi decoder on trained hidden M...
Factored translation models for enriching spoken language translation with prosody
Key contextual information such as word prominence, emphasis, and contrast is typically ignored in speech-to-speech (S2S) translation due to the compartmentalized nature of the translation process. Conventional S2S systems rely on extracting prosody dependent cues from hypothesized (possibly erroneous) translation output using only words and syntax. In contrast, we propose the use of factored t...
A Novel Qualitative State Observer
The state estimation of a quantized system (Q.S.) is a challenging problem for designing feedback control and model-based fault diagnosis algorithms. The core of a Q.S. is a continuous variable system whose inputs and outputs are represented by their corresponding quantized values. This paper is concerned with state estimation of a Q.S. by a qualitative observer. The presented observer in this pape...